String Reconstruction from Substring Compositions
Similar resources
String Reconstruction from Substring Compositions
Motivated by mass-spectrometry protein sequencing, we consider the problem of reconstructing a string from the multisets of its substring composition. We show that all strings of length 7, one less than a prime, or one less than twice a prime can be reconstructed uniquely up to reversal. For all other lengths, we show that unique reconstruction is not always possible and provide sometimes-tigh...
Tight Bounds for String Reconstruction Using Substring Queries
We resolve two open problems presented in [8]. First, we consider the problem of reconstructing an unknown string T over a fixed alphabet using queries of the form “does the string S appear in T?” for some query string S. We show that every non-adaptive algorithm must make Ω(εn) queries in order to reconstruct a 1 − ε fraction of the strings of length n. The second problem is reconstructing a s...
More Efficient Algorithms for Closest String and Substring Problems
The closest string and substring problems find applications in PCR primer design, genetic probe design, motif finding, and antisense drug design. Because of their importance, the two problems have been extensively studied recently in computational biology. Unfortunately, both problems are NP-complete. Researchers have developed both fixed-parameter algorithms and approximation algorithms for the two pr...
Degenerate String Reconstruction from Cover Arrays
Regularities in degenerate strings have recently been a matter of interest because of their use in molecular biology, musical text analysis, cryptanalysis, and other fields. In this paper, we study the problem of reconstructing a degenerate string from a cover array. We present two efficient algorithms to reconstruct a degenerate string from a valid cover array, one using an unbounded alph...
The Average Common Substring Approach to Phylogenomic Reconstruction
We describe a novel method for efficient reconstruction of phylogenetic trees, based on sequences of whole genomes or proteomes, whose lengths may vary greatly. The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings, which is intrinsically related to information theoretic tools (Kullbac...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal
Journal title: SIAM Journal on Discrete Mathematics
Year: 2015
ISSN: 0895-4801,1095-7146
DOI: 10.1137/140962486